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Abstract 

We consider multi-state capture-recapture-recovery data where observed individu¬ 
als are recorded in a set of possible discrete states. Traditionally, the Arnason-Schwarz 
model has been fitted to such data where the state process is modeled as a first-order 
Markov chain, though second-order models have also been proposed and fitted to 
data. However, low-order Markov models may not accurately represent the underly¬ 
ing biology. For example, specifying a (time-independent) first-order Markov process 
assumes that the dwell time in each state (i.e., the duration of a stay in a given 
state) has a geometric distribution, and hence that the modal dwell time is one. Spe¬ 
cifying time-dependent or higher-order processes provides additional flexibility, but 
at the expense of a potentially significant number of additional model parameters. 

We extend the Arnason-Schwarz model by specifying a semi-Markov model for the 
state process, where the dwell-time distribution is specified more generally, using for 
example a shifted Poisson or negative binomial distribution. A state expansion tech¬ 
nique is applied in order to represent the resulting semi-Markov Arnason-Schwarz 
model in terms of a simpler and computationally tractable hidden Markov model. 
Semi-Markov Arnason-Schwarz models come with only a very modest increase in the 
number of parameters, yet permit a significantly more flexible state process. Model 
selection can be performed using standard procedures, and in particular via the use 
of information criteria. The semi-Markov approach allows for important biological 
inference to be drawn on the underlying state process, for example on the times spent 
in the different states. The feasibility of the approach is demonstrated in a simula¬ 
tion study, before being applied to real data corresponding to house finches where the 
states correspond to the presence or absence of conjunctivitis. 

Keywords: capture-recapture-recovery; dwell-time distribution; hidden Markov model; 

multi-state model. 


1 Introduction 

Capture-recapture studies are often undertaken on wildlife populations. Within these stud¬ 
ies, researchers going into the held at a series of capture events. At the initial capture event 
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all individuals observed are uniquely marked, recorded and released. At each subsequent 
capture event, all individuals observed are recorded, marked if they have not previously 
been observed and again released. The observation of additional dead individuals leads 
to capture-recapture-recovery data. Such studies have been formulated within a state- 


space or hidden Markov model (HMM) framework, e.g., in Ginienez et al. (2007), Royle 


(2008) and King (2012, 2014), which has proven particularly useful for scenarios which 


additionally involve individual-level covariate data. We consider the case of time-varying 
(or dynamic) discrete individual covariates (Lebreton et al., 2009). For example, such 


covariates may relate to breeding status, disease status, foraging/resting, life-stage, mark 
status, etc. The covariate values are often referred to as ‘states’ and we use these terms 
interchangeably. Typically it is assumed that if an individual is observed, their corres¬ 
ponding state is also observed. The observed data are the capture/encounter histories for 
each individual observed within the study. For example, assuming only a single discrete 
covariate, an observed capture history for an individual may be: 

30023000t 

This history represents an individual that is observed at the initial capture time in state 3, 
recaptured at times 4 and 5, in states 2 and 3, respectively, and recovered dead at the hnal 
capture occasion. A ‘0’ indicates that an individual is unobserved at the given capture 
time. 

The Arnason-Schwarz (AS) model is often htted to such multi-state capture-recapture- 

for a 


recovery data (jBrownie et al.\ |1993[ |Schwarz et al.\ |1993[ [King and Brooks 
review see 


2003 


Lebreton et al., 2009). This model typically assumes the state process to be a 


first-order Markov chain. Under time-homogeneity, this involves the implicit assumption 
that the times spent in the states (i.e., the dwell-times) are geometrically distributed, 
so that the length of time an individual remains in a state is independent of the length 
of time the individual has already spent in the state (i.e., the process is memoryless). 
Further, the most likely duration of a stay in a state is one time unit (conditional on an 
individual being alive). These assumptions facilitate the implementation, but are often not 
biologically realistic. In particular, the probability an individual leaves a state is often a 
function of the length of time the individual has been in the state. For example, consider 
the two states corresponding to ‘foraging’ and ‘resting’ — an individual will typically 
forage until its hunger is satiated which in general will be a function of the length of time 
it has been foraging. Alternatively state may correspond to the disease states ‘susceptible’ 
and ‘infected’ with the length of time an individual is infected having a non-geo metric 
distribution — the disease follows a biological process where the recovery time may follow 
a distribution with mode distinct from one. Higher-order Markov processes can be specihed; 


for example Hestbeck et al. (1991), Brownie et al. (1993) and Rouan et al. (2009) consider 


a second-order state process. However, the number of transition parameters increases 
exponentially as the order increases, so that consideration of processes of order greater than 
two is usually infeasible and these models only provide limited additional memory structure. 
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Semi-Markov models provide a significantly more fiexible yet parsimonious specification, 
by specifying a general distribution for the state dwell-time distributions. Corresponding 
models can be interpreted as age-dependent models, where ‘age’ corresponds to length of 
time spent in the given state. This idea has been applied to (single-state) stopover models 


Pledger et al, 


Fewster and Patenaude 

2009 

), while Choquet et al. 

(2013 

2014 


considered multi-state capture-recapture models in which one of the covariate states is 
assumed to be semi-Markovian. 


We develop a general class of semi-Markov AS models, where a distribution on the 
positive integers — e.g., a Poisson distribution shifted by one (so that the support be¬ 
comes 1,2,...), a negative binomial distribution shifted by one, or simply the geometric 
distribution (in which case the state process is Markovian) — is specified on the dwell-time 
for each of the possible states. Together with the conditional state transition probabilities, 
given an individual being alive and a state being left, this determines the semi-Markov 
model for the covariate process. Further, we describe an efficient model-fitting approach 
for the proposed general semi-Markov AS models. The semi-Markov AS model specifica¬ 
tion is substantially more fiexible than a first-order Markov model, yet has only very few 
(if any) additional parameters. In addition, it is possible to apply standard model selection 
techniques (e.g., information criteria or likelihood ratio tests) and test for specific biolo¬ 
gical hypotheses such as the absence or presence of memory in the different states (i.e., 
comparing first-order memoryless Markov models with semi-Markov memory models). 

The manuscript is structured as follows. Section describes the different model for¬ 
mulations and associated estimation methods. A simulation study is provided in Section 
before we analyse capture-recapture data of house finches, where state corresponds to 
presence/absence of conjuctivitis in Section]^ Sectionconcludes with final remarks. 

2 Semi-Markov AS model 

2.1 Mathematical notation 

2.1.1 Observations 

We let N denote the number of individuals observed within the study, T the total number 
of capture occasions and /C = {1,..., IF} the set of possible discrete covariate values. For 
each individual i = 1,..., N, and capture time t = 1,... ,T, 

{ 0 if individual i is not observed at time f; 

k if individual i is observed alive in state k E IC at time f; and 
t if individual i is recovered dead in the interval {t — l,t]. 

Letting tio denote the initial capture time for individual i, we define Wj = {t > Lo • 
Xit E {1,..., iF, t}} as the set of times that individual i is observed, and let W)’ denote its 
complement as the set of times the individual is potentially alive but not observed following 
their initial capture. We assume that covariate values are observed without error. This 
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assumption will be relaxed in Section |4.2| where covariates may be unknown when an 
individual is observed. 


2.1.2 Survival and covariate states 

We now consider the underlying state process and the associated notation. For individual 
i = 1,..., we let Sit denote the true state of the individual at time t, such that, for 
t = Uo, ... ,T: 

{ k if individual i is alive and has discrete covariate value k E 1C at time f; 

K + 1 if individual i is dead at time t, but was alive at time f — 1; 

K + 2 if individual i is dead at time t, and was dead at time t — 1. 

We distinguish “recently dead” (state K + 1) from “long dead” (state K + 2) individuals, 
since it is typically assumed that individuals can only be recovered dead in the same inter¬ 
val that death occurs. Model adaptations removing this assumption are straightforward 
and discussed in subsequent sections. For notational convenience we drop the subscript 
i corresponding to individual, but make it clear in the text that we are referring to the 
individual. 


2.2 Traditional AS model 

2.2.1 Model parameters 

We initially consider the standard multi-state AS model, which assumes a hrst-order 
Markov state process. We allow for both live recaptures and dead recoveries. For j, k E JC, 
the model parameters are dehned as follows: 


Mk) 
Mj: k) 

Pt{k) 

Xt 

T^toik) 


Pr(st+i G IC\st = k) 

Pr(st+i = k\st= j and St+i E 1C) 
Pr(a;i = k\st = k) 

Vi{xt = t|st = K + 1) 

Pr(sto = k) 


{survival probabilities)] 
{transition probabilities)] 
{recapture probabilities)] 
{recovery probabilities)] 
{initial state probabilities). 


2.2.2 Formulation as HMM 

We extend the notation and HMM-type specihcation of the Cormack-Jolly-Seber (CJS) 
model for capture-recapture-recovery data presented by Langrock and King (2013). The 
model is specihed as a partially observed state process, coupled with an observation pro¬ 
cess, dehned conditional on the underlying state. The state process can essentially be 
decomposed into the survival process (i.e., whether or not an individual remains in /C) 
and the transitioning process between covariate states, conditional on survival. The state 
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process has probability mass function (p.m.f.) 

St+i) for St,St+i e }C; 

for St e 1C, st+i = K + 1] 
for St G {K + 1, + 2}, St-|_i = K 2] 

otherwise. 

The p.m.f. for the observation process is dehned similarly: 


f{St+l\St) = < 


1 - (j)t{st) 
1 
0 


f{xt\st) = < 


Vt{.st) 

for 

Xt 

= Sf, 

, St 

G /C 



1 -Pt(st) 

for 

Xt 

= 0, 

St 

G /C; 



At 

for 

Xt 

= t, 

St 

= K 

+ 

1: 

1-At 

for 

Xt 

= 0, 

St 

= K 

+ 

1 

1 

for 

Xt 

= 0, 

St 

= K 

+ 

2 

0 

otherwise. 






It is straightforward to modify this dehnition to deal with scenarios in which “long dead’' 


in¬ 


dividuals can be recovered (see, for example, Catchpole et al ., 2001), with a time-dependent 
recovery probability. For example, the recovery probability can be specihed as a decreasing 
function of time following death. 


2.2.3 Likelihood 

The likelihood is the product over each individual of the corresponding probability of their 
observed encounter history, conditional on their initial capture. The likelihood component 
for an individual initially observed at time to can be calculated by summing over all possible 
unknown states after initial capture. Mathematically, 

T-l 

7r(stJ JJ/(si+i|st)/(a:t+i|st+i). 

Sr&{l,...,K+2} t=to 


This likelihood can be conveniently and computationally efficiently calculated using an 
HMM-type forward algorithm. In order to see this, we dehne the time-dependent transition 
probability matrix (t.p.m.) for the states as 


/ (j)t{i)M^A) 


<Pt{l)MhK) o\ 


rt{xt,xt+i) 


<Pt{K)MK,l) 


0 

V 0 


<Pt{K)MK,K) l-<Pt{K) 0 , 

0 1 

0 1 / 


where 


k) 


Mj, k) for {xt, Xt+i) G {(j, k), (0, k), (j, 0), (0, 0)} ; 
0 otherwise. 
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for j, k ^ K.. Note that when an individual is observed and the corresponding covariate 
value recorded at successive times t and t + 1, the associated Ft matrix reduces such that 
there is only a single non-zero element in the {K x K) upper-left submatrix. Similarly if 
the covariate value is observed at time t and not t + 1, the upper-left submatrix of F^ is 
composed of a single row of non-zero elements; if the covariate value is observed at time 
t + 1 and not time t there is a single column of non-zero elements. 

The state-dependent observation process is expressed as a diagonal matrix of the form; 


( diag(l..., 11 - At, 1) if xt = 0; 
Q^{xt) = l diag(0,...,pt(/^),---,0,0,0) if Xt = A: G/C; 

[ diag(0,..., 0, At, 0) if Xt = f. 


Putting all the pieces together, the corresponding likelihood contribution for the given 
individual, conditional on being initially observed at time t = to, is: 


I'T-l 

£ = -.,. I nr t{Xt,Xt+i)Qt{Xt) 

\t=to 


L/s'+2, 


where lx +2 denotes a column vector of ones of length K + 2, and TTt^ the initial state 
distribution at the initial capture, given that the individual is observed. It will often be 
reasonable to assume that the initial detection does not depend on state, in which case 
the initial state probability can conveniently be taken as the stationary distribution of the 
state process conditional on the animal being alive, i.e., the solution to 


T^’to — 


( ^t(l, 1) 




\ 

MK,K) 


Under this assumption, for an individual observed in state k G 1C at initial capture, is 
the vector of length K + 2 with k-th entry Ttto{k) and all other entries 0. Alternatively, the 
initial state distribution can be estimated. Finally, if we condition on the initial observed 
covariate value, Tr^g is the vector with the kUn entry equal to one and all other entries 0. 


2.3 Semi-Markov AS model 

2.3.1 Non-geometric dwell-time distributions 

We extend the AS model to allow for semi-Markov state processes, relaxing the restrictive 
condition that, while alive, the dwell-time distribution in a state follows a geometric dis¬ 
tribution (under time-homogeneous transition probabilities). The observation process of a 
semi-Markov AS model and the survival, recapture and recovery probabilities are defined 
as for the basic AS model. However, in a semi-Markov AS model, the dwell-time in each 
state, conditional on survival, can follow any discrete distribution on the natural numbers, 
e.g., a shifted Poisson or shifted negative binomial. The state process is determined by 
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the survival probabilities, the p.m.f.s of the state dwell-time distributions, di{r),, ^^(r), 
r = 1, 2,... — with associated cumulative distribution functions Fi{r),..., Fk{t) — and 
the conditional state transition probabilities, given the current state is left: 

'fPtU, k) = P(st +1 = k\st=j and St+i G 1C \ {j}), 

for j 7 ^ k. The probabilities of self-transitions, k), conditional on survival, are determ¬ 
ined by the dwell-time distributions, dk{r). Consequently, the state process is no longer 
Markovian, with the probability an individual leaves a state dependent on how long they 
have been in the state. However, we note that the embedded sequence comprising of simply 
the order of the states visited — e.g., the subsequence 1, 2, 1, 3 of the sequence 1, 2, 2, 2, 
1, 3, 3 — does satisfy the Markov property, and is described by the transition probabilities 
^)- First-order Markov AS models (with time-homogeneous state transition probab¬ 
ilities) are a special case of semi-Markov AS models where all dwell-time distributions are 
geometric. 

A common distribution on the positive integers is the shifted negative binomial distri¬ 
bution, which we denote by sNB(i/, 9). The associated p.m.f. is given by 

p(rj = C r = 1,2,3,... . 

The (shifted) negative binomial distribution represents a generalization of the geometric 
distribution; p{r) gives the probability that r — 1 ‘failures’ occur before u ‘successes’ have 
occurred (z/ = 1 corresponds to the geometric distribution). Using the gamma function, the 
dehnition is easily extended for real-valued u. Other potentially useful dwell-time distribu¬ 
tions include the shifted binomial and the shifted Poisson. More flexible specihcations, e.g., 
using mixtures of Poisson distributions, or even such where one parameter is estimated for 
every point in the support, are equally easy to implement in the framework we propose. 


2.3.2 The likelihood 


Inference can be made by extending the approach presented in Langrock and Zucchini 


(2011) to deal with capture-recapture data, implementing a hierarchical approach of mod¬ 
eling (i) survival state and (ii) covariate states conditional on being alive. More specihcally, 
we formulate an AS model with hrst-order state process so that it mirrors the properties 
of the semi-Markov AS model. The corresponding model formulation can be used to £t 
semi-Markov AS models with any desired dwell-time distributions. The idea is to use 
so-called state aggregates, essentially meaning that each of the semi-Markovian covariate 
states is expanded into a set of Markovian states, structuring the transitions between the 
latter so that the desired dwell-time distributions are represented. Thus, we represent an 
arbitrary semi-Markov AS model (i.e., a hidden semi-Markov model) as a simple AS model 
(i.e., an HMM) with an expanded state space. The advantage of doing so is that we can 
use the HMM representation for conducting statistical inference for the given semi-Markov 
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AS model, which means that the entire HMM machinery — most notably the forward al¬ 
gorithm for evaluating the likelihood — becomes applicable to the semi-Markov AS model, 
without any further amendments being required. 

Let Oi, 02 ,..., Ox be positive integers, and let Oq = 0. We consider a first-order Markov 
chain with a* = YliLi + 2 states, where state /c G /C of the semi-Markov AS model is 
expanded into a set of states. We refer to the sets = |n | cbi < n < Ylt=o 

state aggregates^ and assume that each state of is associated with the same distribution 
for the observation process, namely that associated with state k of the semi-Markov AS 
model. For k = 1,..., K and r = 1,2,..., let Ck{r) = {dk{r)} /{I — Fk{r — 1)} for 
Fk{r — 1) < 1, and Ck{r) = 1 for Fk{r — 1) = 1. The functions Ck play a key role in the 
following; essentially they are responsible for rendering the desired dwell-time distributions. 
The a* X a* t.p.m. of the model with first-order state process that represents the semi- 
Markov AS model is defined as: 




(1 - A(i))i„, 0 \ 


r',(x,,x,+i) 




I : 


0 1 

0 1 / 


( 1 ) 


Here, for {xt,Xt+i) G {{k,k), (0,/c), (/c,0), (0,0)}, the x ak leading diagonal matrix 
k = 1,..., K, is defined as 


A 

l-Cfc(l) 0 

0 \ 


0 




0 

0 

1 

0 

0 1 1) 

^0 

0 

0 1 - Cfc(afc) / 


( 2 ) 


else it is the ak x ak matrix with all entries equal to zero. In the special (geometric) case 
with oa: = 1, we define rki = l-Cfc(l) for {xt,xt+i) G {(/c,/c),(0,A;),(/c,0),(0,0)}, and 
^kkt = 0 otherwise. Furthermore, for {xt,Xt+i) G {(j,/c), (0,/c), (j, 0), (0, 0)}, the aj x ak 
off-diagonal matrix F*^ j, j,k = 1,..., K, j ^ k, is defined as 

/ ij;{j,k)cj{l) 0 ... 0 \ 

r*k,t= . (3) 

\A*tij,k)cj{aj) 0 ... 0/ 


In case of ak = 1, the zeros disappear. Thus, different dwell-time distributions lead to 
different c^’s, while the ^he structure of the t.p.m. F^ (a;*, Xt+i) remain unaf¬ 

fected. The structure of Tffxt-, Xt+i) can be interpreted as follows. Conditional on survival. 
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all transitions within state aggregate Ik are governed by the diagonal matrix which 

thus determines the dwell-time distribution for Ik- The off-diagonal matrices determine 
the probabilities of transitions between different state aggregates. For example, for j ^ k, 
the matrix contains the probabilities of all possible transitions between the state 
aggregates Ij and Ik- Within this model specihcation these conditional state transition 
probabilities, given a state is left, are independent of the length of time already spent in a 
state. 

We now consider the initial state distribution, Using the first-order representation, 
it is straightforward to ht a semi-Markov AS assuming that the initial capture is not 
dependent on state and the state process conditional on the animal being alive, i.e., the 
process restricted to the states {1 ,... ,K}, is in equilibrium at the initial capture time, to- 
To achieve this, let = (7r^^(l),..., be the solution to the equation system 


77 


* 

*0 




r;,?,, \ 

^KK,t / 


Assuming stationarity of the restricted state process, if the observed covariate is k = 
1,... ,K at time to, then is the vector with entries I G Ik given by and all other 

entries equal to zero. Alternatively, the initial state distribution can be estimated, in which 
case the equilibrium distribution within state aggregates can be assumed for numerical 
stability. Finally, if we condition on the initial observed state, we would typically assume 
stationarity of the aggregated states within the initial state distribution. 

We now consider the likelihood 


^T-l 


^ Yl'^Kxt,Xt+l)Q*t{Xt) la*. 


\t=to 


( 4 ) 


The diagonal matrix Q*^{xt) is specified analogously to the definition of Qt{xt) in Section 


2.2.3, with diagonal entry k, k = 1,K, in Qi{xt) repeated times on the diagonal in 


Ql{xt), corresponding to the expansion of state k to the state aggregate Ik- This represent¬ 
ation in general only approximates the semi-Markov model (Langrock and Zucchini, 2011). 
The p.m.f. of the distribution of the time spent in state aggregate A, denoted by d^ir), 
in general differs from dk{r) for r > ak, i.e., in the right tail, since by its definition d\{r) 
exhibits a geometric tail. The difference between dk and can be made arbitrarily small 
by choosing sufficiently large. For any dwell-time distribution with either finite support 
or geometric tail, we can ensure that d^(r) = dk{r) for all r by choosing ak appropriately. 
In summary, the likelihood of the semi-Markov AS model can be approximated by Q, 
where the approximation can be made arbitrarily accurate by choosing sufficiently large 
values ak loT k = 1,..., K. 
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2.4 Inference 

For both the standard AS models and the semi-Markov AS models, for mnltiple indi- 
vidnals, the likelihood is simply the prodnct of likelihoods of type Q, corresponding 
to each observed capture history. It is then a routine matter to numerically maximize 
this joint likelihood with respect to the model parameters, subject to well-known tech¬ 
nical issues arising in all optimisation problems (e.g., parameter constraints, numerical 
underflow, local maxima). Approximate confidence intervals for the parameters can be 
obtained based on the inverse of the estimated information matrix, or, alternatively, us¬ 
ing a parametric/non-parametric bootstrap. Model selection, including for the underlying 
covariate process model, can easily be carried out using model selection criteria such as 
Akaike Information Criterion (AIC). 

3 Simulation study 

A simulation study was conducted to demonstrate the feasibility of the semi-Markov ap¬ 
proach and to investigate potential consequences of htting simpler Markovian models when 
the operating model is semi-Markov. We consider a semi-Markov AS model with three un¬ 
derlying covariate states, each with a different dwell-time distribution, namely a shifted 
negative binomial (with parameters v = A and 9 = 0.4), a shifted Poisson (with mean 
/i = 4) and a geometric (with parameter 9 = 0.4), respectively. We assume that the 
other model parameters are constant over time, and so omit the t subscript for each of 
the parameters. The conditional state transition probabilities (given the covariate state 
is left) are specified as ^’*(1)2) = 0.6 = 1 — = 0.8 = 1 — ^’*(2,3) and 

'0*(3,1) = 0.5 = '0*(3,2). The (state-dependent) survival probabilities are specified as 
0(1) = 0.8, 0(2) = 0.9 and 0(3) = 0.6, and recapture probabilities are specified as 
p(l) = 0.2, p{2) = 0.1 and p{3) = 0.5. The (state-independent) recovery probability is 
taken to be A = 0.2. 

We simulated 1000 datasets, each with N = 500 individuals and T = 20 capture 
events. For each dataset, we fitted (i) the correctly specified semi-Markov AS model and 
(ii) the corresponding incorrect first-order Markov model. The latter was done in order 
to investigate potential consequences of not accounting for a semi-Markov structure in 
the covariate process. Figure displays the p.m.f.s of the three dwell-time distributions, 
together with the estimated distributions obtained in the first 10 simulation runs when 
using either the hrst-order or semi-Markov model specification (these are representative 
of the results obtained for all the simulated datasets). Table summarizes the results of 
the remaining parameter estimates obtained using the two different model formulations, 
by providing the mean relative biases and mean standard deviations. 

As expected, the dwell-time distribution is inaccurately estimated when fitting the 
incorrect geometric distribution (i.e., the first-order Markov model) for the two states with 
non-geometric distributions. We note that, due to its higher flexibility, the uncertainty in 
the estimation of the negative binomial state dwell-time distribution (in state 1) was found 
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Table 1: MRB: mean relative bias of estimates; MSD: mean standard deviation of estim¬ 
ates. 


semi-Markov model first-order Markov model 


MRB MSD MRB MSD 


V^*(l,2) 

-0.01 

0.14 

0.14 

0.15 

^*(2,1) 

0.01 

0.10 

0.05 

0.10 

1 — 1 

CO 

* 

0.01 

0.22 

-0.16 

0.27 

0(1) 

0.00 

0.03 

0.00 

0.04 

■>(2) 

0.00 

0.05 

0.00 

0.05 

■>(3) 

-0.01 

0.08 

-0.01 

0.08 

P(l) 

0.03 

0.04 

0.09 

0.07 

P(2) 

0.05 

0.03 

0.05 

0.05 

P(3) 

0.11 

0.19 

0.15 

0.20 

A 

0.01 

0.02 

0.00 

0.02 

state 1 



state 2 






state 1 


state 2 


state 3 


CD 
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o 
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> 



CD 
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Figure 1: Simulation study: true (black crosses) state dwell-time distributions and the 
corresponding estimates (grey lines) obtained in the hrst 10 simulation runs; the top plots 
show the results obtained when using the semi-Markov model, and the bottom plots those 
when using the incorrect hrst-order Markov model. Note that d{r) denotes the p.m.f. for 
the dwell-time distribution, i.e. the probability an individual stays in a given state for 
r = 1, 2,... capture events. 


to be much higher than for the dwell-time distributions in the other two states (cf. the top 
row of plots in Figure [^. In general, the dwell-time distributions of the states may be of 
interest themselves, e.g., where state refers to life stage which in turn may be used to infer 
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future population trajectories. 

The semi-Markov approach led to approximately unbiased estimates for all the para¬ 
meters except the recapture probabilities (see Table [^. These displayed a positive bias as 
a result of small-sample properties of the estimation: all mean relative biases decreased to 
values < 0.005 when sample size was increased by a factor of 10. We note that the recapture 
parameters are essentially nuisance parameters, i.e., not of primary interest. In contrast, 
fitting the first-order model led to biased estimates for the conditional state transition prob¬ 
abilities, despite the fact that the embedded subsequence corresponding to the order the 
states occur in (ignoring duration in each state) is first-order Markovian. Clearly, the in¬ 
accurate estimation of the dwell-time distribution and biased estimation of the conditional 
state probabilities can be problematic when interest lies in the state-switching dynamics. 
However, interestingly, the survival probabilities were generally robust with regard to the 
mis-specihcation of the dwell-time distributions, at least for the set of parameter values 
used in this simulation study. 


4 Application to Carpodacus mexicanus 

4.1 Data 


We consider multi-state capture-recapture histories for house finches [Carpodacus mexic¬ 
anus) collected at a study site in Ithaca, New York, between September and December 
2002. There are no recoveries in this study. During the study period a total oi N = 813 in¬ 
dividuals are observed over the T = 16 capture occasions. The covariate states correspond 
to the presence/absence of Mycoplasma gallisepticum conjunctivitis, so that K = 2. We let 
state 1 denote the absence of conjunctivitis and state 2 the presence of conjunctivitis. For 
these data, it was not always possible to determine the presence/absence of conjunctivitis 
(i.e., the state) of an individual bird observed at a given capture occasion; the state is 
unknown for 4.1% of the recorded observations. The data are provided in the Web Ap¬ 
pendix. Particular interest lies in (i) the survival probabilities for the two states and (ii) 
the dwell-time distributions for the two states. For further details of the data collection 


process and the biological mechanisms, see for example, Faustino et al. (2004) and Conn 


and Cooch (2009). 


4.2 Semi-Markov AS model with unobserved states 

We extend the semi-Markov AS model discussed above to account for the possibility of the 
covariate state of an individual, here presence/absence of conjunctivitis, to be unknown 
when an individual is observed. We define at{k) to be the assignment probability that 
the state of an individual is recorded, given that the individual is observed at time t and 
is in state k (i.e., St = k). For the special case where all covariates are always observed 
(without error), the corresponding p.m.f. is trivial (a point mass on the true values, so 
that at[k) = 1 for all k G /C). Alternatively, if the covariate values are missing at random. 
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then at{k) = at for all k E 1C, and simply corresponds to the probability that the covariate 
value is recorded independently of state, given the individual is observed. Finally, if the 
covariate values are not missing at random, then in general at{j) 7 ^ Cit{k) for j 7 ^ k. 
The corresponding likelihood in the presence of unobserved states is provided in the Web 
Appendix for both the standard AS model and generalised semi-Markov AS model. The 
extension to partially observed covariate values, or covariate values observed with error 
(i.e., false reporting of covariate values) can be similarly incorporated but is omitted here 
to avoid additional complexity (see King and McCrea (2014) for further discussion of such 
models). 


4.3 Models considered 


A hrst-order Markov AS model, allowing for unknown state when an individual is observed, 
has been htted to these data by Conn and Cooch (2009), Schoheld and Barker (2011) and 


King and McCrea (2014). In particular. King and McCrea (2014) £t a range of models to 


the data and identify the following model as optimal via the AIC statistic: 


(j)(t + c) 
p{t + c) 

a{c) 


time (t) and covariate (c) dependent survival prob., additive on the logit scale; 
time (f) and covariate (c) dependent capture prob., additive on the logit scale; 
hrst-order Markov transition probabilities (time independent); 
covariate (c) dependent assignment probabilities. 


We extend the above model to allow for a semi-Markov state process. In particular, we 
consider each covariate dwell-time distribution to be either a (shifted) negative binomial 
or a geometric distribution. Thus, we initially consider four possible models. Specifying 
a (shifted) Poisson distribution led to signihcantly worse htting models, so that we omit 
these results. Since it is not clear if it is reasonable to assume that the covariate process is 
in its stationary distribution at the initial capture occasion, we tried two different model 
formulations; (i) using the stationary distribution as initial distribution; (ii) estimating 
an additional parameter giving the probability of being in the conjunctivitis state at the 
initial capture, but assuming stationarity within state aggregates. The resulting parameter 
estimates obtained using the two different models were very close to each other. In partic¬ 
ular, for the conhguration of dwell-time distributions favored by the AIC, the probability 
of being in the conjunctivitis state at the initial capture was estimated as 0.07 when using 
(i) and as 0.05 when using (ii). In the following, we present the results obtained using the 
simpler model formulation (i). 


4.4 Results 

Table provides the estimates of the parameters of the dwell-time distributions and the 
corresponding AIC values for each of the models considered. The model deemed optimal 
specifies a shifted negative binomial dwell-time distribution for state 1 and a geometric 
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dwell-time distribution for state 2 (though the AAIC values are small for each of the other 
models). The htted dwell-time distributions, for the selected hybrid Markov/semi-Markov 
AS model and the standard (Markov/Markov) AS model, are displayed in Figure]^ Fur¬ 
ther, we removed the dependence of the survival and recapture probabilities on time and/or 
state. Based on the AIC the same structural dependence is retained, though the model 
with time-constant survival probabilities performed only slightly worse (AAIC=2.941). 

Table 2: Model selection for different dwell-time distributions for the house hnch data 
using the AIC statistic, where state 1 corresponds to the absence of conjunctivitis and 
state 2 to the presence of conjunctivitis. The MLEs of the parameters for the dwell-time 
distributions are provided in brackets. The remaining model parameters are of the following 
form; 4>{t + k)jpit + k)ja{k). 


Dwell-time distributions 


State 1 

State 2 

AIC 

AAIC 

geom(0.028) 

geom(0.382) 

5539.211 

0.626 

sNB(0.581,0.017) 

geom(0.409) 

5538.585 

0.0 

geom(0.029) 

sNB(0.612,0.287) 

5540.508 

1.923 

sNB(0.536,0.016) 

sNB(0.463,0.265) 

5539.144 

0.559 


dwell-time distribution in state 1 (no conjunctivitis) 


X semi-Markov model 
o first-order Markov model 


> f ¥: ?l. 


^ ^ ^ , 1 . 


dwell-time distribution In state 2 (conjunctivitis) 




x>x>x>>o>o>ox>x>>ox>x>x>x> 

—^- 1 -^-' 

10 15 20 


Figure 2: House hnch case study: Fitted state dwell-time distributions as obtained using 
the semi-Markov model (with shifted negative binomial state dwell-time distribution for 
state 1 and a geometric state dwell-time distribution for state 2; crosses) and the hrst-order 
Markov model (circles), respectively. The vertical lines indicate 95% pointwise conhdence 
intervals obtained using the observed Fisher information. Note that d{r) denotes the p.m.f. 
for the dwell-time distribution, i.e., the probability an individual stays in a given state for 
r = 1, 2,... capture events. 

As previously found, the survival probabilities for individuals with conjunctivitis are 
lower than those for individuals without conjunctivitis present. The differences in the 
survival probability estimates obtained with the semi-Markov model and the hrst-order 


-H - 

Preprint 













Markov model, respectively, are small (see Figure |^. The semi-Markov model predicts 
higher probabilities for an individual to contract conjunctivitis for short dwell-times in 
the covariate state corresponding to the absence of conjunctivitis. This suggests that an 
individual that has very recently recovered from conjunctivitis has a higher probability 
of contracting conjunctivitis than an individual that has not suffered from the disease for 
a longer duration. (An alternative explanation may be that of misclassification, where 
individuals with conjunctivitis are misdiagnosed as healthy — we do not model possible 
misclassifications here). For the semi-Markov model, the stationary distribution for the 
states is (0.933,0.067). In other words, the stationary distribution indicates that about 
6.7% of the population will have conjunctivitis at any given time (assuming the individu¬ 
als observed are representative of the whole population). Assuming a first-order Markov 
process for both states led to a stationary distribution with 6.9% of the population with 
conjunctivitis. 


state 1 (no conjunctivitis) state 2 (conjunctivitis) 
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Figure 3: House hnch case study: State- and time-dependent survival probabilities estim¬ 
ated using the semi-Markov model (crosses) and the first-order Markov model (circles), 
respectively. 
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5 Discussion 

We extended the AS model for multi-state capture-recapture-recovery data, by permit¬ 
ting the inclusion of memory within the evolution of the states via the specihcation of 
a semi-Markov state process. Within this model, memory is incorporated by specifying a 
parametric distribution on the length of time an individual remains in each given state (i.e., 
specifying a dwell-time distribution for each state). Corresponding models are attractive 
due to the increased level of biological realism they permit with a very limited increase 
in the number of additional parameters needed, in contrast to many previous attempts 
of introducing memory by using higher-order Markov models. Ignoring memory present 
within the state process of the model can lead to biased inference, most notably with regard 
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to the conditional state transition probabilities. Interestingly, in our simulation study, no 
bias was observed in the survival probabilities when htting an incorrect hrst-order Markov 
model. However, in general the dwell-time distributions may themselves be of interest, 
e.g., where state refers to life-stage or disease status. In such cases the corresponding sta¬ 
tionary distribution for the states may also be of interest for future predictions. Further, it 
is possible to test biological hypotheses with regard to the absence or presence of memory 
in the state process. 

In our approach to inference in semi-Markov AS models, we have followed the analog¬ 
ous approach to that suggested by Langrock and Zucchini (2011). The underlying idea is 
to structure a hrst-order Markov process such that it accurately represents a semi-Markov 
process, with the crucial advantage that the efficient recursions available for hrst-order 
models become applicable also to semi-Markov models. Various similar approaches have 
been discussed in the literature. A good overview of these is given in Johnson (2005), who 
argues that the usage of this kind of model is “almost certainly a much better practical 
choice for duration modeling than development and implementation of more complex and 
computationally expensive models with explicit modihcations to handle duration probabil¬ 
ities”. While the approach in theory could have computational limitations — as argued by 
Choquet et al. (2014) — those would occur only if for some of the dwell-time distributions 
the support would be very large, covering several hundred time units (i.e., capture occa¬ 
sions). This is unlikely to occur in capture-recapture scenarios, with usually only relatively 
short time series of capture occasions. 

The semi-Markov AS model was applied to house-hnch data, where an indication of 
memory was found within the dwell-time distribution of an individual being in the state 
of non-conjunctivitis. In other words, the length of time for which an individual contracts 
conjunctivitis depends on when it was last infected (i.e., reinfection of conjunctivitis is 
more likely immediately following their previous recovery from infection). However, the 
length of time an individual does suffer from conjunctivitis appears to be well modeled via 
a hrst-order Markov model (i.e., recovery from conjunctivitis is independent of the length 
of time an individual has suffered from conjunctivitis). 

The approach described within this paper can immediately be extended to other forms 
of capture-recapture models including multi-event models (Pradel, 2005) and partially 
observed capture-recapture-recovery models (King and McCrea, 2014), where states are 
not directly observed or observed with error. We note that the multi-event model has 
exactly the form of a standard hidden (semi-)Markov model, where no states are directly 
observed. In these cases, the transition process remains the same, and only the observation 
process needs to be modihed in terms of the assignment probabilities associated with 
different events or partial states, conditional on being in each given state. This again 
emphasizes the advantage of specifying the capture-recapture (-recovery) model as a state- 


space (or hidden Markov) model, separating the system and observation processes (King 


2014). For example, being able to calculate the likelihood makes it straightforward to 


estimate model components nonparametrically, via penalized likelihood (Michelot et al. 


2015). The extension to multiple covariate processes (e.g., site and breeding status) is also 
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conceptually straightforward, but the curse of dimensionality renders the model-htting 
more computationally intensive. 

The semi-Markov AS model described currently assumes that the remaining model 
parameters, most notably the survival probabilities, are time- and/or state-dependent. A 
similar semi-Markov (state-independent) model can be proposed for the survival prob¬ 
abilities, essentially specifying a parametric distribution on survival, using the analogous 
approach as that for the state transition probabilities. However, the extension to include a 
state-dependent semi-Markov survival model is non-trivial if individuals can move between 
different states. Alternatively, the inclusion of additional covariate information provides a 
further modeling complexity of interest within such models. The development and htting 
of such models is the focus of current research. 
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Appendix: Likelihood for (semi-Markov) AS model with 
unobserved states 

Here we extend the semi-Markov AS model discussed in Section [2]3] in order to account for 
the possibility of the covariate state of an individual — presence/absence of conjunctivitis 
in the house finch real data case study — to be unknown when an individual is observed. 
We expand the set of possible values observed within a capture history to include an 
“unknown” state. For individual i = 1,... ,N and f = 1,... ,T, we define the additional 
state: xu = —1 if individual i is observed at time t but where the associated covariate 
value is unknown. Thus, for each individual i = 1,..., N and capture time t = 1,... ,T, 
the observed data are given by: 

if individual i is observed at time t but where the covariate state is unknown, 
if individual i is not observed at time t, 
if individual i is observed alive in state k E 1C at time t, and 
if individual i is recovered dead in the interval {t — l,f\. 

As before, for individual i = 1,..., A^, we let denote the true state of the individual at 
time t, such that, for t = tiQ ,..., T: 

{ k if individual i is alive and has discrete covariate value k E K, at time t, 

K + 1 if individual i is dead at time t, but was alive at time t — 1, 

K + 2 if individual i is dead at time t, and was dead at time t — 1. 

We initially describe the likelihood for the AS model in the presence of live recaptures 
and dead recoveries where covariates may be unknown before extending to the semi-Markov 
case. 


Xit = 


-1 

0 

k 

t 
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Standard AS model 

Recall that the likelihood contribution for an individual, conditional on being initially 
observed at time t = to, is given by: 


/T-l 


^ — TTto n Tt{xt,xt+i)Qt{xt) 1 


K+2- 


\t=to 


Allowing for unobserved covariate states, the observation process is the diagonal matrix; 


Qti^t) = < 


f diag(pt(l)(l - at(l)),... ,pi(A')(l - at(A')),0,0) ii Xt =-1; 
diag(l - pt{l), ..., 1 - Pt{K), 1 - Xt, 1 ) if Xt = 0; 

diag(0,... ,pt{k)at{k), ... ,0,0) ii Xt = k elC] 

diag(0,..., 0, At, 0) if Xt = f- 


Recall that at{k) is dehned to be the assignment probability that the state of an individual 
is recorded, given that the individual is observed at time t and is in state k (i.e., St = k). 
The transition probability matrix for the covariate process, Tt{xt, Xt+i), remains unchanged 
except the dehnition of iptij, k) is extended such that. 


( Mj, k) for {xt, xt+i) e {(j, k), (0, k), (j, 0), (0,0), (-1, k), (j, -1), 

U3,k) = l (- 1 , 0 ), ( 0 ,- 1 ), (- 1 ,- 1 )}; 

0 otherwise, 

for j, /c G /C. In other words, the set of possible values of Xt and Xt+i are extended 
for which '^^(j,/c) = ipt{j,k), to take into account the additional possibility of unknown 
covariate values when an individual is unobserved. 


Semi-Markov AS model 

The extension to the semi-Markov case permitting unknown covariate values when an 
individual is observed follows in the analogous manner as to that presented in Section 2.3 
(where covariates are always known when an individual is observed). In particular, each 
state is expanded into a set of state aggregates and the likelihood corrected accordingly. 

Recall that for the semi-Markov AS model the likelihood contribution for an individual 
initially observed at time to is given by 

\t=to / 

The term Ql{xt) is specihed analogous to the dehnition for the AS model with unob¬ 
served states, but allowing for the state aggregates. In other words, the diagonal entry 
k = in Qi{xt) is repeated au times in the diagonal matrix Ql{xt). The mat¬ 

rix Tl[xt,Xt-\-i) is dehned as in Equation (1) in the main manuscript, but where the 


- 20 - 
Preprint 



leading diagonal matrix is defined as in Equation (2) for the set of observations 

{xt,xt+i) e {(A;,A;),(0,A;),(A;,0),(0,0),(-1,A;),(A;,-!),(-!, 0),(0,-!),(-!,-1)}. Simil¬ 
arly, the off-diagonal matrix is defined analogously as in Equation (3). 
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